Indexing graph-structured XML data for efficient structural join operation

Authors

  • Qun Chen
  • Andrew Lim
  • Kian Win Ong
  • Jiqing Tang
Abstract

Structural join has been established as a primitive technique for matching binary containment patterns, specifically the parent–child and ancestor–descendant relationships, on tree-structured XML data. While current indexing approaches and evaluation algorithms proposed for the structural join operation assume a tree-structured data model, the presence of reference links in XML documents may render the underlying model a graph instead. In the more general category of semi-structured data, of which XML is an example, the data model is also usually assumed to be a graph. In this paper, we present an indexing approach and corresponding evaluation algorithms for efficiently performing the structural join operation on graph-structured data. Our approach encodes the structural containment relationship of a graph on multiple nested layers, each of which is tree-structured, possibly with the exception of the last one. With each tree-structured layer indexed with the inverted technique, the structural join operation on a graph can be accomplished by recursively performing structural joins on the nested layer trees. Our extensive experiments on both benchmark and synthetic XML data indicate that the proposed approach has good potential to perform significantly better than existing ones in terms of both I/O and CPU cost.

© 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.datak.2005.05.008. Data & Knowledge Engineering 58 (2006) 159–179.
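
For background, the tree-based structural join that the abstract takes as its starting point can be sketched as follows. This is only a minimal illustration, assuming the common region encoding (start, end, level) for tree nodes and a stack-based merge over two lists sorted by start position. The names Node, contains and structural_join and the sample data are hypothetical, and the sketch does not reproduce the paper's layered index for graphs.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    """Hypothetical region label for one element of a tree-structured document."""
    start: int  # position at which the element opens
    end: int    # position at which the element closes
    level: int  # depth in the tree (root = 0)
    tag: str


def contains(a: Node, d: Node) -> bool:
    """True if a's region strictly encloses d's region (a is an ancestor of d)."""
    return a.start < d.start and d.end < a.end


def structural_join(ancestors: List[Node], descendants: List[Node],
                    parent_child: bool = False) -> List[Tuple[Node, Node]]:
    """Join two lists, each sorted by start, on the containment relationship.

    Assumes both lists come from the same tree, so regions are either
    nested or disjoint. With parent_child=True only pairs one level
    apart are reported.
    """
    result: List[Tuple[Node, Node]] = []
    stack: List[Node] = []  # ancestors currently open, outermost first
    i = 0
    for d in descendants:
        # Push every candidate ancestor that opens before d.
        while i < len(ancestors) and ancestors[i].start < d.start:
            a = ancestors[i]
            # Drop stacked ancestors that closed before a opens.
            while stack and stack[-1].end < a.start:
                stack.pop()
            stack.append(a)
            i += 1
        # Drop stacked ancestors that closed before d opens.
        while stack and stack[-1].end < d.start:
            stack.pop()
        for a in stack:
            if contains(a, d) and (not parent_child or a.level + 1 == d.level):
                result.append((a, d))
    return result


# Hypothetical sample: book > section > section > title, and book > section > title.
sections = [Node(2, 7, 1, "section"), Node(3, 6, 2, "section"), Node(8, 11, 1, "section")]
titles = [Node(4, 5, 3, "title"), Node(9, 10, 2, "title")]

ancestor_descendant_pairs = structural_join(sections, titles)
parent_child_pairs = structural_join(sections, titles, parent_child=True)
```

On a graph, a node can be reached along more than one path, so a single (start, end) region per node no longer captures every containment relationship. That is the gap the layered encoding described in the abstract is meant to address.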

Similar resources

Securing XML Query Processing Storage

The effective processing of semi-structured data queries is a preliminary part of the data mining stage. XML queries employ regular path expressions to find structural patterns within XML documents. The structural join operation is a crucial part of XML query processing. Existing approaches reduce complex join expressions to several binary structural joins. In this paper, we propose a new ...

Subgraph Join: Efficient Processing Subgraph Queries on Graph-Structured XML Document

The information in many applications can be naturally represented as graph-structured XML documents. A structural query on a graph-structured XML document matches subgraphs of the document against a given schema. Query processing on graph-structured XML documents brings new challenges. In this paper, for the processing of subgraph queries, we design a subgraph join algorithm based ...

Labeling Scheme and Structural Joins for Graph-Structured XML Data

When XML documents are modeled as graphs, many challenging research issues arise. In particular, query processing for graph-structured XML data brings new challenges because traditional structural join methods cannot be directly applied. In this paper, we propose a labeling scheme for graph-structured XML data. With this labeling scheme, the reachability relationship of two nodes can be judged e...

Towards Cost-based Optimizations of Twig Content-based Queries

In recent years, many approaches to indexing XML data have appeared. These approaches attempt to process XML queries efficiently and sufficient query plans are built for this purpose. Some effort has been expended in the optimization of XML query processing [20]. There are not many works that take cost-based query optimizations into account. In work [20], we find some cost-based considerations,...

An Efficient XML Index Structure with Bottom-Up Query Processing

With the growing importance of XML in data exchange, much research has been done in providing flexible query mechanisms to extract data from structured XML documents. The semi-structured nature of XML data and the requirements on query flexibility pose unique challenges to database indexing methods. Recently, ViST, which uses a suffix tree and BTree, was proposed to reduce the search time of the docum...

Journal title:
  • Data Knowl. Eng.

Volume 58

Pages 159–179

Publication date 2006